Overview


Dataset statistics

Number of variables: 13
Number of observations: 899164
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 13439
Duplicate rows (%): 1.5%
Total size in memory: 89.2 MiB
Average record size in memory: 104.0 B
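
The derived figures in this block can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported above; the 8-bytes-per-column figure is an assumption (it would hold if every column were stored as a 64-bit value), not something the report states.

```python
# Figures copied from the "Dataset statistics" block.
n_rows = 899_164
n_cols = 13
duplicate_rows = 13_439
total_mib = 89.2

# Share of duplicate rows, as a percentage rounded to one decimal.
duplicate_pct = round(100 * duplicate_rows / n_rows, 1)

# Average bytes per record, derived from the reported total size.
bytes_per_record = total_mib * 1024**2 / n_rows

# Assumption: each of the 13 columns occupies 8 bytes per row.
assumed_bytes_per_record = n_cols * 8

print(duplicate_pct)
print(round(bytes_per_record, 1))
print(assumed_bytes_per_record)
```

Under that assumption the per-record size agrees with the reported 104.0 B.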

Variable types

Numeric: 8
Categorical: 5

Dataset

Description: This profiling report was copied from 2025 Misiriya's KaggleX BIPOC project
URL

Alerts

Dataset has 13439 (1.5%) duplicate rows [Duplicates]
ApprovalFY is highly overall correlated with RetainedJob [High correlation]
GrAppv is highly overall correlated with Term [High correlation]
RetainedJob is highly overall correlated with ApprovalFY [High correlation]
Term is highly overall correlated with GrAppv [High correlation]
FranchiseCode is highly imbalanced (68.2%) [Imbalance]
NoEmp is highly skewed (γ₁ = 80.24824355) [Skewed]
CreateJob is highly skewed (γ₁ = 36.99135473) [Skewed]
RetainedJob is highly skewed (γ₁ = 36.85481184) [Skewed]
NAICS has 201948 (22.5%) zeros [Zeros]
CreateJob has 629248 (70.0%) zeros [Zeros]
RetainedJob has 440403 (49.0%) zeros [Zeros]
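
The skewness alerts report γ₁ values between roughly 37 and 80. As a rough illustration of what such a number measures, here is a minimal sketch of the moment-based skewness coefficient. It is not necessarily the estimator the profiler uses (pandas, for instance, applies a bias correction), and the toy lists are hypothetical.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5, with no bias correction."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical toy data: one symmetric sample, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
long_tail = [0] * 70 + [1] * 20 + [5] * 9 + [500]

print(skewness(symmetric))
print(skewness(long_tail))
```

A single extreme value in an otherwise small-valued column is enough to push the coefficient far above zero, which matches the pattern of the job-count columns here (medians of 0 to 4, maxima in the thousands).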

Reproduction

Analysis started: 2025-02-14 06:17:07.151783
Analysis finished: 2025-02-14 06:17:28.625767
Duration: 21.47 seconds
Software version: ydata-profiling v4.12.2
Download configuration: config.json
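
A report of this kind can be regenerated along the following lines. This is a minimal sketch, assuming pandas and ydata-profiling are installed; the CSV file name in the usage comment is hypothetical, and the configuration used for the original run is not reproduced here.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so this module loads even without the packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Profiling report")
    report.to_file(html_path)
    return html_path

# Usage (hypothetical file name):
# build_report("loans_cleaned.csv")
```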

Variables

State
Real number (ℝ)

Distinct: 52
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1199.5479
Minimum: 111
Maximum: 4633
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.9 MiB

Quantile statistics

Minimum: 111
5-th percentile: 301
Q1: 612
median: 1314
Q3: 1518
95-th percentile: 2301
Maximum: 4633
Range: 4522
Interquartile range (IQR): 906
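
Range and IQR follow directly from the other quantile values. The sketch below checks the State figures and, on a hypothetical toy list, shows one way to obtain quartiles with the standard library; the interpolation method is an assumption and may differ from the one the profiler uses.

```python
from statistics import quantiles

# Quantile values copied from the State block.
minimum, q1, q3, maximum = 111, 612, 1518, 4633

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle half of the data

# Quartiles of a hypothetical toy list, using linear interpolation.
toy = [1, 2, 3, 4, 5, 6, 7, 8, 9]
toy_quartiles = quantiles(toy, n=4, method="inclusive")

print(value_range, iqr)
print(toy_quartiles)
```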

Descriptive statistics

Standard deviation: 648.31203
Coefficient of variation (CV): 0.54046362
Kurtosis: -1.0275286
Mean: 1199.5479
Median Absolute Deviation (MAD): 495
Skewness: -0.07158061
Sum: 1.0785903 × 10^9
Variance: 420308.48
Monotonicity: Not monotonic
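
Several of these statistics are tied together by simple identities: CV is the standard deviation divided by the mean, variance is the standard deviation squared, and the sum is the mean times the number of observations. The sketch below checks the State figures; small differences are expected because the inputs are already rounded.

```python
# Figures copied from the State block and the dataset overview.
n_rows = 899_164
mean = 1199.5479
std = 648.31203

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance from the standard deviation
total = mean * n_rows  # sum from the mean and the row count

print(round(cv, 5))
print(round(variance, 2))
print(f"{total:.4e}")
```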
Histogram with fixed size bins (bins=50)
Common values
Value                 Count     Frequency (%)
301                   130619    14.5%
2024                  70458     7.8%
1425                  57693     6.4%
612                   41212     4.6%
1601                  35170     3.9%
1508                  32622     3.6%
912                   29669     3.3%
1301                  25272     2.8%
1314                  24373     2.7%
1410                  24035     2.7%
Other values (42)     428041    47.6%

Minimum 10 values
Value                 Count     Frequency (%)
111                   2405      0.3%
112                   8362      0.9%
118                   6341      0.7%
126                   17631     2.0%
301                   130619    14.5%
315                   20605     2.3%
320                   12229     1.4%
403                   1613      0.2%
405                   2220      0.2%
612                   41212     4.6%

Maximum 10 values
Value                 Count     Frequency (%)
4633                  14        < 0.1%
2325                  2839      0.3%
2322                  3287      0.4%
2309                  21040     2.3%
2301                  23263     2.6%
2220                  5454      0.6%
2201                  13264     1.5%
2120                  18776     2.1%
2024                  70458     7.8%
2014                  9403      1.0%

NAICS
Real number (ℝ)

Zeros 

Distinct: 25
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 39.612263
Minimum: 0
Maximum: 92
Zeros: 201948
Zeros (%): 22.5%
Negative: 0
Negative (%): 0.0%
Memory size: 6.9 MiB
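
NAICS is one of three columns flagged in the Alerts for a large share of zeros. The percentages can be recomputed from the reported counts and the number of observations, as in this short sketch.

```python
# Zero counts copied from the Alerts; row count from the overview.
n_rows = 899_164
zero_counts = {"NAICS": 201_948, "CreateJob": 629_248, "RetainedJob": 440_403}

# Share of zeros per column, as a percentage rounded to one decimal.
zero_pct = {
    name: round(100 * count / n_rows, 1)
    for name, count in zero_counts.items()
}

for name, pct in zero_pct.items():
    print(f"{name}: {pct}%")
```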

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 23
median: 44
Q3: 56
95-th percentile: 81
Maximum: 92
Range: 92
Interquartile range (IQR): 33

Descriptive statistics

Standard deviation: 26.284706
Coefficient of variation (CV): 0.66354972
Kurtosis: -1.0572678
Mean: 39.612263
Median Absolute Deviation (MAD): 18
Skewness: -0.24819754
Sum: 35617921
Variance: 690.88577
Monotonicity: Not monotonic
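
The Median Absolute Deviation is the median of the absolute distances from the median. The sketch below shows the unscaled form on a hypothetical toy list; whether the profiler applies any scaling factor is not stated in the report.

```python
from statistics import median

def median_absolute_deviation(values):
    """Unscaled MAD: median of |x - median(x)|."""
    values = list(values)
    center = median(values)
    return median(abs(v - center) for v in values)

# Hypothetical toy data (not the NAICS column itself).
toy = [0, 23, 44, 56, 81]
print(median_absolute_deviation(toy))
```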
Histogram with fixed size bins (bins=25)
Common values
Value                 Count     Frequency (%)
0                     201948    22.5%
44                    84737     9.4%
81                    72618     8.1%
54                    68170     7.6%
72                    67600     7.5%
23                    66646     7.4%
62                    55366     6.2%
42                    48743     5.4%
45                    42514     4.7%
33                    38284     4.3%
Other values (15)     152538    17.0%

Minimum 10 values
Value                 Count     Frequency (%)
0                     201948    22.5%
11                    9005      1.0%
21                    1851      0.2%
22                    663       0.1%
23                    66646     7.4%
31                    11809     1.3%
32                    17936     2.0%
33                    38284     4.3%
42                    48743     5.4%
44                    84737     9.4%

Maximum 10 values
Value                 Count     Frequency (%)
92                    229       < 0.1%
81                    72618     8.1%
72                    67600     7.5%
71                    14640     1.6%
62                    55366     6.2%
61                    6425      0.7%
56                    32685     3.6%
55                    257       < 0.1%
54                    68170     7.6%
53                    13632     1.5%

ApprovalFY
Real number (ℝ)

High correlation 

Distinct: 51
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2001.1436
Minimum: 1962
Maximum: 2014
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.9 MiB

Quantile statistics

Minimum: 1962
5-th percentile: 1991
Q1: 1997
median: 2002
Q3: 2006
95-th percentile: 2009
Maximum: 2014
Range: 52
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 5.9138459
Coefficient of variation (CV): 0.0029552332
Kurtosis: -0.092531047
Mean: 2001.1436
Median Absolute Deviation (MAD): 4
Skewness: -0.58537855
Sum: 1.7993562 × 10^9
Variance: 34.973573
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count     Frequency (%)
2005                  77525     8.6%
2006                  76040     8.5%
2007                  71876     8.0%
2004                  68290     7.6%
2003                  58193     6.5%
1995                  45758     5.1%
2002                  44391     4.9%
1996                  40112     4.5%
2008                  39540     4.4%
1997                  37748     4.2%
Other values (41)     339691    37.8%

Minimum 10 values
Value                 Count     Frequency (%)
1962                  1         < 0.1%
1965                  1         < 0.1%
1966                  1         < 0.1%
1967                  2         < 0.1%
1968                  2         < 0.1%
1969                  4         < 0.1%
1970                  8         < 0.1%
1971                  20        < 0.1%
1972                  27        < 0.1%
1973                  52        < 0.1%

Maximum 10 values
Value                 Count     Frequency (%)
2014                  268       < 0.1%
2013                  2458      0.3%
2012                  5997      0.7%
2011                  12608     1.4%
2010                  16848     1.9%
2009                  19126     2.1%
2008                  39540     4.4%
2007                  71876     8.0%
2006                  76040     8.5%
2005                  77525     8.6%

Term
Real number (ℝ)

High correlation 

Distinct: 412
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 110.77308
Minimum: 0
Maximum: 569
Zeros: 810
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 6.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 16
Q1: 60
median: 84
Q3: 120
95-th percentile: 300
Maximum: 569
Range: 569
Interquartile range (IQR): 60

Descriptive statistics

Standard deviation: 78.857305
Coefficient of variation (CV): 0.7118815
Kurtosis: 0.18570424
Mean: 110.77308
Median Absolute Deviation (MAD): 33
Skewness: 1.1209258
Sum: 99603164
Variance: 6218.4746
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count     Frequency (%)
84                    230162    25.6%
60                    89945     10.0%
240                   85982     9.6%
120                   77654     8.6%
300                   44727     5.0%
180                   28164     3.1%
36                    19800     2.2%
12                    17095     1.9%
48                    15621     1.7%
72                    9419      1.0%
Other values (402)    280595    31.2%

Minimum 10 values
Value                 Count     Frequency (%)
0                     810       0.1%
1                     1608      0.2%
2                     1809      0.2%
3                     2112      0.2%
4                     2173      0.2%
5                     1866      0.2%
6                     3054      0.3%
7                     1761      0.2%
8                     1693      0.2%
9                     1875      0.2%

Maximum 10 values
Value                 Count     Frequency (%)
569                   1         < 0.1%
527                   1         < 0.1%
511                   1         < 0.1%
505                   1         < 0.1%
481                   1         < 0.1%
480                   1         < 0.1%
461                   1         < 0.1%
449                   1         < 0.1%
445                   1         < 0.1%
443                   1         < 0.1%

NoEmp
Real number (ℝ)

Skewed 

Distinct: 599
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.411353
Minimum: 0
Maximum: 9999
Zeros: 6631
Zeros (%): 0.7%
Negative: 0
Negative (%): 0.0%
Memory size: 6.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
median: 4
Q3: 10
95-th percentile: 40
Maximum: 9999
Range: 9999
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 74.108196
Coefficient of variation (CV): 6.4942514
Kurtosis: 7965.2886
Mean: 11.411353
Median Absolute Deviation (MAD): 3
Skewness: 80.248244
Sum: 10260678
Variance: 5492.0248
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count     Frequency (%)
1                     154254    17.2%
2                     138297    15.4%
3                     90674     10.1%
4                     73644     8.2%
5                     60319     6.7%
6                     45759     5.1%
10                    31536     3.5%
7                     31495     3.5%
8                     31361     3.5%
12                    20822     2.3%
Other values (589)    221003    24.6%

Minimum 10 values
Value                 Count     Frequency (%)
0                     6631      0.7%
1                     154254    17.2%
2                     138297    15.4%
3                     90674     10.1%
4                     73644     8.2%
5                     60319     6.7%
6                     45759     5.1%
7                     31495     3.5%
8                     31361     3.5%
9                     18131     2.0%

Maximum 10 values
Value                 Count     Frequency (%)
9999                  4         < 0.1%
9992                  1         < 0.1%
9945                  1         < 0.1%
9090                  1         < 0.1%
9000                  2         < 0.1%
8500                  1         < 0.1%
8041                  1         < 0.1%
8018                  1         < 0.1%
8000                  7         < 0.1%
7999                  1         < 0.1%

NewExist
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.9 MiB
Category counts: 1 (644869), 2 (253125), 0 (1170)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 899164
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
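
As a small illustration of those character properties, Python's standard `unicodedata` module exposes the general category and name of a code point. The sketch below looks up the three characters that occur in this column; script and block lookups are not part of the standard library and are left out.

```python
import unicodedata

# The three distinct characters that appear in NewExist.
chars = "012"

# General category of each character ("Nd" means decimal digit).
categories = {ch: unicodedata.category(ch) for ch in chars}

for ch in chars:
    print(ch, categories[ch], unicodedata.name(ch))
```

All three characters share one category, which is consistent with the single distinct category reported above.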

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value                 Count     Frequency (%)
1                     644869    71.7%
2                     253125    28.2%
0                     1170      0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value                 Count     Frequency (%)
1                     644869    71.7%
2                     253125    28.2%
0                     1170      0.1%

Most occurring characters

Value                 Count     Frequency (%)
1                     644869    71.7%
2                     253125    28.2%
0                     1170      0.1%

Most occurring categories

Value                 Count     Frequency (%)
(unknown)             899164    100.0%

Most frequent character per category

(unknown)
Value                 Count     Frequency (%)
1                     644869    71.7%
2                     253125    28.2%
0                     1170      0.1%

Most occurring scripts

Value                 Count     Frequency (%)
(unknown)             899164    100.0%

Most frequent character per script

(unknown)
Value                 Count     Frequency (%)
1                     644869    71.7%
2                     253125    28.2%
0                     1170      0.1%

Most occurring blocks

Value                 Count     Frequency (%)
(unknown)             899164    100.0%

Most frequent character per block

(unknown)
Value                 Count     Frequency (%)
1                     644869    71.7%
2                     253125    28.2%
0                     1170      0.1%

CreateJob
Real number (ℝ)

Skewed  Zeros 

Distinct: 246
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.4303764
Minimum: 0
Maximum: 8800
Zeros: 629248
Zeros (%): 70.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 10
Maximum: 8800
Range: 8800
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 236.68817
Coefficient of variation (CV): 28.075634
Kurtosis: 1369.911
Mean: 8.4303764
Median Absolute Deviation (MAD): 0
Skewness: 36.991355
Sum: 7580291
Variance: 56021.288
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count     Frequency (%)
0                     629248    70.0%
1                     63174     7.0%
2                     57831     6.4%
3                     28806     3.2%
4                     20511     2.3%
5                     18691     2.1%
10                    11602     1.3%
6                     11009     1.2%
8                     7378      0.8%
7                     6374      0.7%
Other values (236)    44540     5.0%

Minimum 10 values
Value                 Count     Frequency (%)
0                     629248    70.0%
1                     63174     7.0%
2                     57831     6.4%
3                     28806     3.2%
4                     20511     2.3%
5                     18691     2.1%
6                     11009     1.2%
7                     6374      0.7%
8                     7378      0.8%
9                     3330      0.4%

Maximum 10 values
Value                 Count     Frequency (%)
8800                  648       0.1%
5621                  1         < 0.1%
5199                  1         < 0.1%
5085                  1         < 0.1%
3500                  1         < 0.1%
3100                  1         < 0.1%
3000                  4         < 0.1%
2515                  1         < 0.1%
2140                  1         < 0.1%
2020                  1         < 0.1%

RetainedJob
Real number (ℝ)

High correlation  Skewed  Zeros 

Distinct: 358
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.797257
Minimum: 0
Maximum: 9500
Zeros: 440403
Zeros (%): 49.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 4
95-th percentile: 20
Maximum: 9500
Range: 9500
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 237.1206
Coefficient of variation (CV): 21.961188
Kurtosis: 1362.0182
Mean: 10.797257
Median Absolute Deviation (MAD): 1
Skewness: 36.854812
Sum: 9708505
Variance: 56226.179
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count     Frequency (%)
0                     440403    49.0%
1                     88790     9.9%
2                     76851     8.5%
3                     49963     5.6%
4                     39666     4.4%
5                     32627     3.6%
6                     23796     2.6%
7                     16530     1.8%
8                     15698     1.7%
10                    15438     1.7%
Other values (348)    99402     11.1%

Minimum 10 values
Value                 Count     Frequency (%)
0                     440403    49.0%
1                     88790     9.9%
2                     76851     8.5%
3                     49963     5.6%
4                     39666     4.4%
5                     32627     3.6%
6                     23796     2.6%
7                     16530     1.8%
8                     15698     1.7%
9                     8735      1.0%

Maximum 10 values
Value                 Count     Frequency (%)
9500                  1         < 0.1%
8800                  648       0.1%
7250                  1         < 0.1%
5000                  1         < 0.1%
4441                  1         < 0.1%
4000                  2         < 0.1%
3900                  1         < 0.1%
3860                  1         < 0.1%
3225                  1         < 0.1%
3200                  1         < 0.1%

FranchiseCode
Categorical

Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.9 MiB
Category counts: 0 (847389), 1 (51775)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 899164
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value                 Count     Frequency (%)
0                     847389    94.2%
1                     51775     5.8%
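
The Alerts section flags FranchiseCode as highly imbalanced (68.2%). That figure is consistent with one minus the normalized Shannon entropy of the class counts, as the sketch below shows. Treating this as the profiler's definition is an assumption based on the numbers matching, and `imbalance_score` is a hypothetical helper name.

```python
from math import log2

def imbalance_score(counts):
    """1 - H / log2(k): 0 for a uniform split, close to 1 for one dominant class.

    Expects at least two classes.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * log2(p) for p in probs)  # Shannon entropy in bits
    return 1 - entropy / log2(len(counts))

# Counts copied from the FranchiseCode table (values 0 and 1).
franchise_score = imbalance_score([847_389, 51_775])
print(f"{franchise_score:.1%}")
```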

Length

Histogram of lengths of the category

Common Values (Plot)

Value                 Count     Frequency (%)
0                     847389    94.2%
1                     51775     5.8%

Most occurring characters

Value                 Count     Frequency (%)
0                     847389    94.2%
1                     51775     5.8%

Most occurring categories

Value                 Count     Frequency (%)
(unknown)             899164    100.0%

Most frequent character per category

(unknown)
Value                 Count     Frequency (%)
0                     847389    94.2%
1                     51775     5.8%

Most occurring scripts

Value                 Count     Frequency (%)
(unknown)             899164    100.0%

Most frequent character per script

(unknown)
Value                 Count     Frequency (%)
0                     847389    94.2%
1                     51775     5.8%

Most occurring blocks

Value                 Count     Frequency (%)
(unknown)             899164    100.0%

Most frequent character per block

(unknown)
Value                 Count     Frequency (%)
0                     847389    94.2%
1                     51775     5.8%

RevLineCr
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.9 MiB
Category counts: 1 (420288), 0 (262195), 3 (201397), 2 (15284)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 899164
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value                 Count     Frequency (%)
1                     420288    46.7%
0                     262195    29.2%
3                     201397    22.4%
2                     15284     1.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value                 Count     Frequency (%)
1                     420288    46.7%
0                     262195    29.2%
3                     201397    22.4%
2                     15284     1.7%

Most occurring characters

Value                 Count     Frequency (%)
1                     420288    46.7%
0                     262195    29.2%
3                     201397    22.4%
2                     15284     1.7%

Most occurring categories

Value                 Count     Frequency (%)
(unknown)             899164    100.0%

Most frequent character per category

(unknown)
Value                 Count     Frequency (%)
1                     420288    46.7%
0                     262195    29.2%
3                     201397    22.4%
2                     15284     1.7%

Most occurring scripts

Value                 Count     Frequency (%)
(unknown)             899164    100.0%

Most frequent character per script

(unknown)
Value                 Count     Frequency (%)
1                     420288    46.7%
0                     262195    29.2%
3                     201397    22.4%
2                     15284     1.7%

Most occurring blocks

Value                 Count     Frequency (%)
(unknown)             899164    100.0%

Most frequent character per block

(unknown)
Value                 Count     Frequency (%)
1                     420288    46.7%
0                     262195    29.2%
3                     201397    22.4%
2                     15284     1.7%

LowDoc
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.9 MiB
Category counts: 0 (788829), 1 (110335)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 899164
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value                 Count     Frequency (%)
0                     788829    87.7%
1                     110335    12.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value                 Count     Frequency (%)
0                     788829    87.7%
1                     110335    12.3%

Most occurring characters

Value                 Count     Frequency (%)
0                     788829    87.7%
1                     110335    12.3%

Most occurring categories

Value                 Count     Frequency (%)
(unknown)             899164    100.0%

Most frequent character per category

(unknown)
Value                 Count     Frequency (%)
0                     788829    87.7%
1                     110335    12.3%

Most occurring scripts

Value                 Count     Frequency (%)
(unknown)             899164    100.0%

Most frequent character per script

(unknown)
Value                 Count     Frequency (%)
0                     788829    87.7%
1                     110335    12.3%

Most occurring blocks

Value                 Count     Frequency (%)
(unknown)             899164    100.0%

Most frequent character per block

(unknown)
Value                 Count     Frequency (%)
0                     788829    87.7%
1                     110335    12.3%

MIS_Status
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.9 MiB
Category counts: 1 (741345), 0 (157819)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 899164
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value                 Count     Frequency (%)
1                     741345    82.4%
0                     157819    17.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value                 Count     Frequency (%)
1                     741345    82.4%
0                     157819    17.6%

Most occurring characters

Value                 Count     Frequency (%)
1                     741345    82.4%
0                     157819    17.6%

Most occurring categories

Value                 Count     Frequency (%)
(unknown)             899164    100.0%

Most frequent character per category

(unknown)
Value                 Count     Frequency (%)
1                     741345    82.4%
0                     157819    17.6%

Most occurring scripts

Value                 Count     Frequency (%)
(unknown)             899164    100.0%

Most frequent character per script

(unknown)
Value                 Count     Frequency (%)
1                     741345    82.4%
0                     157819    17.6%

Most occurring blocks

Value                 Count     Frequency (%)
(unknown)             899164    100.0%

Most frequent character per block

(unknown)
Value                 Count     Frequency (%)
1                     741345    82.4%
0                     157819    17.6%

GrAppv
Real number (ℝ)

High correlation 

Distinct: 22128
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 192686.98
Minimum: 200
Maximum: 5472000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.9 MiB

Quantile statistics

Minimum: 200
5-th percentile: 10000
Q1: 35000
median: 90000
Q3: 225000
95-th percentile: 750000
Maximum: 5472000
Range: 5471800
Interquartile range (IQR): 190000

Descriptive statistics

Standard deviation: 283263.39
Coefficient of variation (CV): 1.4700702
Kurtosis: 21.018882
Mean: 192686.98
Median Absolute Deviation (MAD): 65000
Skewness: 3.5207901
Sum: 1.7325719 × 10^11
Variance: 8.0238149 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count     Frequency (%)
50000                 69394     7.7%
25000                 51258     5.7%
100000                50977     5.7%
10000                 38366     4.3%
150000                27624     3.1%
20000                 23434     2.6%
35000                 23181     2.6%
30000                 21004     2.3%
5000                  19146     2.1%
15000                 18472     2.1%
Other values (22118)  556308    61.9%

Minimum 10 values
Value                 Count     Frequency (%)
200                   2         < 0.1%
300                   1         < 0.1%
400                   2         < 0.1%
500                   33        < 0.1%
700                   4         < 0.1%
800                   4         < 0.1%
950                   1         < 0.1%
1000                  444       < 0.1%
1200                  12        < 0.1%
1300                  2         < 0.1%

Maximum 10 values
Value                 Count     Frequency (%)
5472000               1         < 0.1%
5000000               40        < 0.1%
4991700               1         < 0.1%
4950000               1         < 0.1%
4908500               1         < 0.1%
4900000               2         < 0.1%
4872000               1         < 0.1%
4869000               1         < 0.1%
4830000               1         < 0.1%
4800000               1         < 0.1%

Interactions

[64 interaction plot images (Matplotlib v3.10.0) are not reproduced in this text version.]

Correlations

               ApprovalFY  CreateJob  FranchiseCode  GrAppv  LowDoc  MIS_Status  NAICS   NewExist  NoEmp   RetainedJob  RevLineCr  State   Term
ApprovalFY     1.000       0.268      0.048          -0.300  0.375   0.327       0.440   0.062     -0.226  0.546        0.359      -0.001  -0.297
CreateJob      0.268       1.000      0.001          0.093   0.010   0.012       0.156   0.009     0.034   0.377        0.016      -0.032  0.082
FranchiseCode  0.048       0.001      1.000          0.065   0.028   0.015       0.222   0.142     0.002   0.004        0.129      0.036   0.105
GrAppv         -0.300      0.093      0.065          1.000   0.116   0.074       -0.142  0.037     0.455   -0.138       0.099      -0.067  0.558
LowDoc         0.375       0.010      0.028          0.116   1.000   0.084       0.154   0.161     0.003   0.010        0.226      0.087   0.169
MIS_Status     0.327       0.012      0.015          0.074   0.084   1.000       0.148   0.022     0.004   0.013        0.146      0.054   0.491
NAICS          0.440       0.156      0.222          -0.142  0.154   0.148       1.000   0.093     -0.151  0.268        0.214      -0.000  -0.076
NewExist       0.062       0.009      0.142          0.037   0.161   0.022       0.093   1.000     0.004   0.002        0.065      0.070   0.088
NoEmp          -0.226      0.034      0.002          0.455   0.003   0.004       -0.151  0.004     1.000   0.124        0.005      -0.040  0.200
RetainedJob    0.546       0.377      0.004          -0.138  0.010   0.013       0.268   0.002     0.124   1.000        0.016      -0.030  -0.157
RevLineCr      0.359       0.016      0.129          0.099   0.226   0.146       0.214   0.065     0.005   0.016        1.000      0.046   0.242
State          -0.001      -0.032     0.036          -0.067  0.087   0.054       -0.000  0.070     -0.040  -0.030       0.046      1.000   -0.088
Term           -0.297      0.082      0.105          0.558   0.169   0.491       -0.076  0.088     0.200   -0.157       0.242      -0.088  1.000
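
The two "High correlation" pairs listed in the Alerts can be recovered from this matrix by filtering on absolute value. The sketch below uses a handful of entries copied from the matrix; the 0.5 cut-off is an assumption chosen because it separates the flagged pairs from the rest, not a value stated in the report.

```python
# A few off-diagonal entries copied from the matrix. The matrix is
# symmetric, so each pair is stored once.
corr = {
    ("ApprovalFY", "RetainedJob"): 0.546,
    ("ApprovalFY", "GrAppv"): -0.300,
    ("ApprovalFY", "NAICS"): 0.440,
    ("GrAppv", "Term"): 0.558,
    ("GrAppv", "NoEmp"): 0.455,
    ("MIS_Status", "Term"): 0.491,
    ("CreateJob", "RetainedJob"): 0.377,
}

THRESHOLD = 0.5  # assumed cut-off for a "high" correlation

# Keep pairs whose absolute coefficient reaches the threshold.
high_pairs = sorted(pair for pair, r in corr.items() if abs(r) >= THRESHOLD)
print(high_pairs)
```

With this cut-off only ApprovalFY/RetainedJob and GrAppv/Term remain, matching the Alerts; MIS_Status/Term (0.491) falls just below it.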

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
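
Since the plots themselves are not reproduced here, the following minimal sketch shows the idea behind both displays on hypothetical rows: a per-column count of missing values, and a text-mode nullity matrix in which `#` marks a present value and `.` a missing one. The report's own dataset has no missing cells, so its real plots would show every column fully populated.

```python
# Hypothetical rows with some missing values (None).
columns = ["Term", "NoEmp", "GrAppv"]
rows = [
    {"Term": 84, "NoEmp": 4, "GrAppv": 60000},
    {"Term": 60, "NoEmp": None, "GrAppv": 40000},
    {"Term": None, "NoEmp": None, "GrAppv": 287000},
]

# Nullity by column: number of missing values per column.
missing = {c: sum(r[c] is None for r in rows) for c in columns}

# Nullity matrix: one text line per row, one character per column.
matrix = ["".join("." if r[c] is None else "#" for c in columns) for r in rows]

print(missing)
print("\n".join(matrix))
```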

Sample

First rows
        State  NAICS  ApprovalFY  Term  NoEmp  NewExist  CreateJob  RetainedJob  FranchiseCode  RevLineCr  LowDoc  MIS_Status  GrAppv
0       914    45     1997        84    4      2         0          0            0              1          1       1           60000
1       914    72     1997        60    2      2         0          0            0              1          1       1           40000
2       914    62     1997        180   7      1         0          0            0              1          0       1           287000
3       1511   0      1997        60    2      1         0          0            0              1          1       1           35000
4       612    0      1997        240   14     1         7          7            0              1          0       1           229000
5       320    33     1997        120   19     1         0          0            0              1          0       1           517000
6       1410   0      1980        45    45     2         0          0            0              1          0       0           600000
7       612    81     1997        84    1      2         0          0            0              1          1       1           45000
8       612    72     1997        297   2      2         0          0            0              1          0       1           305000
9       320    0      1997        84    3      2         0          0            0              1          1       1           70000

Last rows
        State  NAICS  ApprovalFY  Term  NoEmp  NewExist  CreateJob  RetainedJob  FranchiseCode  RevLineCr  LowDoc  MIS_Status  GrAppv
899154  1508   0      1997        60    1      1         0          0            0              0          0       1           10000
899155  1425   62     1997        180   2      1         0          0            0              0          0       1           128000
899156  1304   33     1997        60    20     1         0          0            0              0          0       1           50000
899157  301    31     1997        36    40     1         0          0            0              1          0       1           200000
899158  2024   0      1997        84    5      2         0          0            0              1          1       1           79000
899159  1508   45     1997        60    6      1         0          0            0              0          0       1           70000
899160  1508   45     1997        60    6      1         0          0            0              3          0       1           85000
899161  301    33     1997        108   26     1         0          0            0              1          0       1           300000
899162  809    0      1997        60    6      1         0          0            0              1          1       0           75000
899163  809    0      1997        48    1      2         0          0            0              1          0       1           30000

Duplicate rows

Most frequently occurring

       State  NAICS  ApprovalFY  Term  NoEmp  NewExist  CreateJob  RetainedJob  FranchiseCode  RevLineCr  LowDoc  MIS_Status  GrAppv  # duplicates
10666  1601   54     2003        84    1      1         0          1            0              3          0       1           5000    32
847    301    0      1998        84    1      1         0          0            0              3          0       1           10000   31
4285   612    54     2003        84    1      1         0          1            0              3          0       1           10000   30
4316   612    54     2004        84    1      1         0          1            0              3          0       1           10000   30
9929   1508   54     2008        60    0      1         0          0            0              3          0       1           25000   24
10889  1601   81     2003        84    1      1         0          1            0              3          0       1           5000    23
10698  1601   54     2004        84    1      1         0          1            0              3          0       1           5000    22
2229   301    48     2008        12    145    1         124        145          0              3          0       1           1500    20
2494   301    54     2004        84    2      1         0          2            0              3          0       1           10000   20
4439   612    56     2004        84    1      1         0          1            0              3          0       1           10000   18
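
A table like this can be derived by counting identical rows. The sketch below does so on hypothetical four-column rows with `collections.Counter`. Whether the report's headline figure of 13439 counts only the extra copies or every row in a duplicated group is not stated, so only the extra-copy count is shown and labelled as such.

```python
from collections import Counter

# Hypothetical rows (State, NAICS, ApprovalFY, Term); tuples are hashable.
rows = [
    (612, 54, 2003, 84),
    (612, 54, 2003, 84),
    (301, 0, 1998, 84),
    (612, 54, 2003, 84),
    (301, 0, 1998, 84),
    (1601, 81, 2003, 84),
]

counts = Counter(rows)

# Rows that occur more than once, most frequent first.
most_frequent = [(row, n) for row, n in counts.most_common() if n > 1]

# Extra copies beyond the first occurrence of each distinct row.
extra_copies = sum(n - 1 for n in counts.values())

print(most_frequent)
print(extra_copies)
```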